Overview

Dataset Statistics

Number of Variables 11
Number of Rows 5109
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 1.8 MB
Average Row Size in Memory 372.2 B
Variable Types
  • Categorical: 8
  • Numerical: 3
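The two memory figures above can be cross-checked against the row count. This is a minimal sketch that assumes "MB" means 1024² bytes; the report does not state which unit convention it uses.

```python
# Values copied from the Dataset Statistics above.
n_rows = 5109
avg_row_bytes = 372.2

# Total size implied by the average row size.
total_bytes = n_rows * avg_row_bytes

# Assumption: the report may use binary (1024**2) or decimal (1000**2) megabytes.
total_mb_binary = total_bytes / 1024**2
total_mb_decimal = total_bytes / 1000**2

print(f"binary: {total_mb_binary:.2f} MB, decimal: {total_mb_decimal:.2f} MB")
```

Under the binary assumption the product rounds to the reported 1.8 MB, which suggests (but does not prove) that this is the convention in use.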

Dataset Insights

hypertension has constant length 1
heart_disease has constant length 1
Residence_type has constant length 5
stroke has constant length 1

Variables

gender

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 350.1 KB

Length

Mean 5.172
Standard Deviation 0.9852
Median 6
Minimum 4
Maximum 6

Sample

1st row Male
2nd row Female
3rd row Male
4th row Female
5th row Female

Letter

Count 26424
Lowercase Letter 21315
Space Separator 0
Uppercase Letter 5109
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Female, Male) take over 50.0%
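The character-class labels above (Lowercase Letter, Uppercase Letter, Space Separator, Dash Punctuation, Decimal Number) match the names of Unicode general categories. As an illustrative sketch, not the report's actual implementation, such counts could be produced with Python's standard `unicodedata` module. The helper `category_counts` is a hypothetical name, and only the five sample rows are used here.

```python
import unicodedata
from collections import Counter

# The five sample rows shown for gender above.
sample = ["Male", "Female", "Male", "Female", "Female"]

def category_counts(values):
    """Count characters by Unicode general category across all values.

    Lu = uppercase letter, Ll = lowercase letter, Zs = space separator,
    Pd = dash punctuation, Nd = decimal number.
    """
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

counts = category_counts(sample)
letters = counts["Lu"] + counts["Ll"]
print(counts["Lu"], counts["Ll"], letters)
```

The same identity holds for the full column as reported: 21315 lowercase plus 5109 uppercase letters gives the letter count of 26424.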

age

numerical

Approximate Distinct Count 104
Approximate Unique (%) 2.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79.8 KB
Mean 43.23
Minimum 0.08
Maximum 82
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • age is skewed left (γ1 = -0.1374)

Quantile Statistics

Minimum 0.08
5-th Percentile 5
Q1 25
Median 45
Q3 61
95-th Percentile 79
Maximum 82
Range 81.92
IQR 36

Descriptive Statistics

Mean 43.23
Standard Deviation 22.6136
Variance 511.3738
Sum 220862
Skewness -0.1374
Kurtosis -0.9911
Coefficient of Variation 0.5231
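Several of the age statistics above are derived from others, so they can be checked for internal consistency. The sketch below uses only the reported (rounded) values, so small discrepancies are expected.

```python
# Values copied from the age statistics above.
minimum, maximum = 0.08, 82
q1, q3 = 25, 61
mean, std, variance = 43.23, 22.6136, 511.3738
n_rows = 5109

value_range = maximum - minimum   # reported Range: 81.92
iqr = q3 - q1                     # reported IQR: 36
cv = std / mean                   # reported Coefficient of Variation: 0.5231
implied_variance = std ** 2       # reported Variance: 511.3738
implied_sum = mean * n_rows       # reported Sum: 220862

print(f"range={value_range:.2f} iqr={iqr} cv={cv:.4f}")
print(f"variance~{implied_variance:.2f} sum~{implied_sum:.0f}")
```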

hypertension

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 329.3 KB
  • The largest value (0) is over 9.26 times larger than the second largest value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 5109
  • The top 2 categories (0, 1) take over 50.0%
  • hypertension has words of constant length

heart_disease

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 329.3 KB
  • The largest value (0) is over 17.51 times larger than the second largest value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 0
3rd row 1
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 5109
  • The top 2 categories (0, 1) take over 50.0%
  • heart_disease has words of constant length

ever_married

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 337.6 KB
  • The largest value (Yes) is over 1.91 times larger than the second largest value (No)

Length

Mean 2.6563
Standard Deviation 0.475
Median 3
Minimum 2
Maximum 3

Sample

1st row Yes
2nd row Yes
3rd row Yes
4th row Yes
5th row Yes

Letter

Count 13571
Lowercase Letter 8462
Space Separator 0
Uppercase Letter 5109
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Yes, No) take over 50.0%

work_type

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 365.4 KB
  • The largest value (Private) is over 3.57 times larger than the second largest value (Self-employed)

Length

Mean 8.2464
Standard Deviation 2.1422
Median 7
Minimum 7
Maximum 13

Sample

1st row Private
2nd row Self-employed
3rd row Private
4th row Private
5th row Self-employed

Letter

Count 40633
Lowercase Letter 36211
Space Separator 0
Uppercase Letter 4422
Dash Punctuation 819
Decimal Number 0
  • The top 2 categories (Private, Self-employed) take over 50.0%

Residence_type

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 349.2 KB

Length

Mean 5
Standard Deviation 0
Median 5
Minimum 5
Maximum 5

Sample

1st row Urban
2nd row Rural
3rd row Rural
4th row Urban
5th row Rural

Letter

Count 25545
Lowercase Letter 20436
Space Separator 0
Uppercase Letter 5109
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Urban, Rural) take over 50.0%
  • Residence_type has words of constant length

avg_glucose_level

numerical

Approximate Distinct Count 3978
Approximate Unique (%) 77.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79.8 KB
Mean 106.1404
Minimum 55.12
Maximum 271.74
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • avg_glucose_level is skewed right (γ1 = 1.5724)

Quantile Statistics

Minimum 55.12
5-th Percentile 60.712
Q1 77.24
Median 91.88
Q3 114.09
95-th Percentile 216.304
Maximum 271.74
Range 216.62
IQR 36.85

Descriptive Statistics

Mean 106.1404
Standard Deviation 45.285
Variance 2050.7316
Sum 542271.3
Skewness 1.5724
Kurtosis 1.6789
Coefficient of Variation 0.4267
  • avg_glucose_level is not normally distributed (p-value ≈ 0.0031)
  • avg_glucose_level has 627 outliers
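The report does not say how outliers are defined. A common convention is Tukey's rule, which flags values more than 1.5 × IQR beyond the quartiles. The sketch below computes the fences for avg_glucose_level under that assumption; it cannot reproduce the count of 627 without the raw data.

```python
# Quartiles and extremes copied from the avg_glucose_level statistics above.
q1, q3 = 77.24, 114.09
minimum, maximum = 55.12, 271.74

# Assumption: Tukey's rule with multiplier 1.5 (not confirmed by the report).
k = 1.5
iqr = q3 - q1
lower_fence = q1 - k * iqr
upper_fence = q3 + k * iqr

# Which tails can contain outliers under this rule?
has_low_outliers = minimum < lower_fence
has_high_outliers = maximum > upper_fence

print(f"fences: [{lower_fence:.3f}, {upper_fence:.3f}]")
print(f"low tail: {has_low_outliers}, high tail: {has_high_outliers}")
```

Under this rule the lower fence falls below the observed minimum, so any flagged values would all lie in the high tail, which is consistent with the right skew reported for this variable.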

bmi

numerical

Approximate Distinct Count 419
Approximate Unique (%) 8.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79.8 KB
Mean 28.8946
Minimum 10.3
Maximum 97.6
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • bmi is skewed right (γ1 = 1.0761)

Quantile Statistics

Minimum 10.3
5-th Percentile 17.7
Q1 23.8
Median 28.4
Q3 32.8
95-th Percentile 42.66
Maximum 97.6
Range 87.3
IQR 9

Descriptive Statistics

Mean 28.8946
Standard Deviation 7.6982
Variance 59.2628
Sum 147622.3065
Skewness 1.0761
Kurtosis 3.6181
Coefficient of Variation 0.2664
  • bmi is not normally distributed (p-value ≈ 1.75e-06)
  • bmi has 126 outliers
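The direction of skew reported for each numerical variable can be sanity-checked from the mean and median alone. The sketch below uses Pearson's second (median) skewness coefficient, 3 × (mean − median) / standard deviation. This is a different measure from the moment-based γ1 in the report, so only the signs are expected to agree, not the magnitudes.

```python
# (mean, median, standard deviation, reported skewness) from the sections above.
stats = {
    "age": (43.23, 45, 22.6136, -0.1374),
    "avg_glucose_level": (106.1404, 91.88, 45.285, 1.5724),
    "bmi": (28.8946, 28.4, 7.6982, 1.0761),
}

def pearson_median_skewness(mean, median, std):
    """Pearson's second skewness coefficient: 3 * (mean - median) / std."""
    return 3 * (mean - median) / std

# Compare the sign of the heuristic with the sign of the reported skewness.
signs_agree = {}
for name, (mean, median, std, reported) in stats.items():
    heuristic = pearson_median_skewness(mean, median, std)
    signs_agree[name] = (heuristic > 0) == (reported > 0)
    print(f"{name}: heuristic={heuristic:+.3f}, reported={reported:+.4f}")
```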

smoking_status

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 374.6 KB

Length

Mean 10.0814
Standard Deviation 3.3156
Median 12
Minimum 6
Maximum 15

Sample

1st row formerly smoked
2nd row never smoked
3rd row never smoked
4th row smokes
5th row never smoked

Letter

Count 48730
Lowercase Letter 47186
Space Separator 2776
Uppercase Letter 1544
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (never smoked, Unknown) take over 50.0%

stroke

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 329.3 KB
  • The largest value (0) is over 19.52 times larger than the second largest value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 5109
  • The top 2 categories (0, 1) take over 50.0%
  • stroke has words of constant length
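For the three 0/1 variables (hypertension, heart_disease, stroke), the report gives only the ratio between the larger and smaller class. Approximate class counts can be recovered from that ratio and the row count. The figures produced below are estimates from rounded ratios, not counts stated in the report, and `estimate_counts` is a hypothetical helper name.

```python
n_rows = 5109

# Majority-to-minority ratios reported above for the two-class variables.
ratios = {"hypertension": 9.26, "heart_disease": 17.51, "stroke": 19.52}

def estimate_counts(ratio, total):
    """Estimate (majority, minority) counts for a two-class variable.

    With majority = ratio * minority and majority + minority = total,
    minority = total / (1 + ratio).
    """
    minority = round(total / (1 + ratio))
    return total - minority, minority

for name, ratio in ratios.items():
    majority, minority = estimate_counts(ratio, n_rows)
    share = minority / n_rows
    print(f"{name}: ~{majority} zeros, ~{minority} ones ({share:.1%} positive)")
```

The estimate for stroke puts the positive class at roughly 5% of rows, which is worth keeping in mind if stroke is used as a prediction target.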

Interactions

Correlations

Missing Values